Accuracy of deep learning-based integrated tooth models by merging intraoral scans and CBCT scans for 3D evaluation of root position during orthodontic treatment

Objective This study aimed to evaluate the accuracy of deep learning-based integrated tooth models (ITMs) generated by merging intraoral scans and cone-beam computed tomography (CBCT) scans for three-dimensional (3D) evaluation of root position during orthodontic treatment, and to compare the fabrication process of ITMs with that of the manual method.
Material and methods Intraoral scans and corresponding CBCT scans before and after treatment were obtained from 15 patients who completed orthodontic treatment with premolar extraction. A total of 600 ITMs were generated using deep learning technology and manual methods by merging the intraoral scans and CBCT scans at pretreatment. Posttreatment intraoral scans were integrated into the tooth model, and the resulting estimated root positions were compared with the actual root positions on the posttreatment CBCT. Discrepancies between the estimated and actual root positions, including average surface differences, arch widths, inter-root distances, and root axis angles, were obtained for both the deep learning and manual methods, and these measurements were compared between the two methods.
Results The average surface differences between estimated and actual ITMs in the manual method were 0.02 mm and 0.03 mm for the maxillary and mandibular arches, respectively. In the deep learning method, the discrepancies were 0.07 mm and 0.08 mm for the maxillary and mandibular arches, respectively. For the measurements of arch widths, inter-root distances, and root axis angles, there were no significant differences between the estimated and actual models in either the manual or the deep learning method, except for some measurements. Comparing the two methods, only three measurements showed significant differences. The procedure times taken to obtain the measurements were longer in the manual method than in the deep learning method.
Conclusion Both deep learning and manual methods showed similar accuracy in the integration of intraoral scans and CBCT images. Considering time and efficiency, the deep learning automatic method for ITMs is highly recommended for clinical practice.


Introduction
Three-dimensional (3D) digital datasets have become the standard. As previous studies have reported the integration of different imaging modalities [1][2][3], treatment planning and evaluation in orthodontics and maxillofacial surgery have shifted to digital 3D methods. However, cone-beam computed tomography (CBCT) is limited in its visualization of the occlusion. Most researchers and clinicians agree that CBCT scans do not provide sufficiently detailed information about the dentition and interocclusal relationships for treatment planning, owing to the limited scanning resolution and streak artifacts caused by radiopaque dental restorations or orthodontic braces [4][5][6][7][8][9]. Moreover, taking CBCT scans during or after treatment puts patients and clinicians at risk of radiation exposure. Repeated CBCT scans expose the patient to higher levels of radiation; this is not recommended clinically, especially in children [10][11][12].
Lee et al. [13,14] reported a method for monitoring root movement that combines a pretreatment CBCT scan with a posttreatment laser-scanned model. The method used in their study is time-consuming and labor-intensive. In particular, isolating individual tooth roots from the alveolar bone in CBCT is technique sensitive and user dependent. Given the limited resolution and noise of the images, and the contact between adjacent teeth and between teeth and bone, tooth segmentation using manual threshold adjustment [15] or region growing [13] can be challenging under complex image conditions and in the presence of root branching. Because tooth root isolation is an important step in tooth model fabrication, accurate segmentation is essential. Moreover, a method that isolates the tooth including the root from CBCT images without removing the alveolar bone is preferable. This study aimed to evaluate the accuracy of 3D virtual tooth models and compare deep learning and manual methods in terms of the fabrication process.

Material and methods
This study was approved by the institutional review board of Chonnam National University, Gwangju, Korea (CNUDH-EXP-2019-018), in compliance with the principles of the Declaration of Helsinki. Patient records for this study were obtained from the patient database at the Department of Orthodontics at Chonnam National University Dental Hospital. The study protocol is shown in Fig. 1. The inclusion criteria were as follows: (1) patients who had completed orthodontic treatment with premolar extraction and had intraoral scans and (2) patients with available intraoral CBCT scans at the pre- and posttreatment stages. The exclusion criteria were as follows: (1) patients who had undergone restorative treatments during orthodontic treatment; (2) patients who had restorative or prosthodontic treatment covering more than two surfaces during treatment; (3) patients who were treated with interproximal reduction during the treatment; and (4) patients who had dilacerated roots or severe root resorption. The sample size calculation was based on the results of a study by Lee et al. [14]. In their study, the mean difference of buccolingual inclination measurements was 1.30 ± 0.92, and the effect size was calculated as 1.41. A statistical power of 80% and a type I error of 5% were assumed, and the calculation was performed using G*Power (version 3.1.9.2, Heinrich Heine University, Dusseldorf, Germany). The calculation indicated that five individuals were required; 15 patients were included in this study. A total of 600 teeth (300 teeth comprising central and lateral incisors, canines, second premolars, and first molars in the pretreatment stage; 300 teeth comprising central and lateral incisors, canines, second premolars, and first molars in the posttreatment stage) were included.
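As a check on the reported effect size, and assuming it is the standardized mean difference (the mean difference divided by its standard deviation, an interpretation the text does not state explicitly), the figures from Lee et al. [14] give:

```latex
d = \frac{\bar{x}_{\text{diff}}}{s_{\text{diff}}} = \frac{1.30}{0.92} \approx 1.41
```

This agrees with the effect size entered into the power calculation.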
To generate the 3D tooth model at pretreatment, intraoral scans and their corresponding CBCT scans at pretreatment are required. Intraoral scans of the maxillary and mandibular arches were obtained using a Trios scanner (3Shape, Copenhagen, Denmark). The intraoral scans were trimmed to the clinical crown by deleting the gingival area, submitted to the OrthoAnalyzer (3Shape) program, and exported in the stereolithography (STL) file format. CBCT scans were taken with an Alphard Vega scanner (Asahi Roentgen Co., Kyoto, Japan) set at a field of view of 200 × 179 mm², 80 kV, 5 mA, and a voxel size of 0.39 mm. Automatic and manual methods were used for individual tooth segmentation of all teeth from the pretreatment and posttreatment CBCT scans. Mimics (version 23.0; Materialise, Leuven, Belgium) was used for the manual method. Individual tooth segmentation was performed by extracting and isolating individual teeth from the surrounding alveolar bone using the region-growing tool in the program. For the automatic method, 3D tooth modeling using a convolutional neural network (CephX, ORCA Dental AI Inc., Herzliya, Israel) was used to generate all tooth segmentations from the pretreatment and posttreatment CBCT scans (Fig. 2).
Intraoral scans and segmented individual CBCT teeth were both imported into the 3D reverse engineering software (Rapidform 2006, INUS Technology, Seoul, Korea), and the 3D tooth model at pretreatment was fabricated by integrating the pretreatment CBCT-scanned tooth root and the intraoral-scanned tooth crown. The first step in this integration process was to use the "initial registration" function to roughly pick three corresponding points on both the crown of the pretreatment CBCT tooth and its corresponding pretreatment intraoral scan. The central fossae of the right and left second molars and the midpoint of the central incisors were selected as the three corresponding points. To maintain the orientation of the CBCT scans, the pretreatment CBCT teeth were locked in the software. Then, a "regional registration" function, which uses an iterative closest point algorithm, was used to finalize the best-fit registration. The regions of interest included the labial surfaces of the six anterior teeth and the buccal surfaces of the molars. Next, the crown area of the CBCT tooth was made translucent by changing the color opacity and was erased so that it could be replaced with its corresponding intraoral scan. Finally, the 3D tooth model, composed of the CBCT-scanned root and the intraoral-scanned crown, was generated by integrating these two imaging modalities using the "merge" function.
To estimate the posttreatment root position, the individual tooth model at pretreatment was superimposed on the posttreatment intraoral scans. Both the individual tooth models at pretreatment and the posttreatment intraoral scans were imported into the software. Initial registration and regional registration were used to superimpose the two imaging modalities. Since the crown morphology is identical in principle, initial registration by picking several points alone would have been sufficient, but regional registration was added for a more accurate superimposition.
To determine whether the resulting estimated position of the root would agree with the actual root position on the posttreatment CBCT, the discrepancy between the estimated and actual root positions was calculated. First, the average surface discrepancy between the estimated and actual tooth models was calculated for each of the automatic and manual methods, and this discrepancy was then compared between the two methods. For quantitative evaluation, 3D Euclidean inter-root distances between proximal teeth, arch widths, and root axis angles to the occlusal plane were obtained from both the estimated and actual tooth models. The arch widths comprised the inter-incisor, inter-canine, inter-premolar, and inter-molar widths (Fig. 3). The distances were obtained using the point-to-point distance function in the software, with the root apices as the points. These values were also compared between the automatic and manual methods. Furthermore, the difference between the automatic and manual methods in terms of individual tooth segmentation was evaluated. The time taken to obtain these measurements was also recorded and compared between the two methods. All these processes were conducted by a single experienced researcher with over 10 years of experience in this field, who graduated from a dental school and completed a Ph.D. program in orthodontics.
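The three-point initial registration described above can be illustrated with a minimal sketch. It builds an orthonormal frame from each landmark triplet and derives the rigid transform that maps one frame onto the other. This is an assumption made for illustration: the algorithm Rapidform actually uses is not described in the text, the iterative closest point refinement is not shown, and all coordinates and function names are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def frame(p0, p1, p2):
    """Orthonormal basis (e1, e2, e3) built from three non-collinear landmarks."""
    e1 = normalize(sub(p1, p0))
    e3 = normalize(cross(e1, sub(p2, p0)))
    e2 = cross(e3, e1)
    return e1, e2, e3

def rigid_from_landmarks(moving, fixed):
    """Return (R, t) such that fixed ≈ R @ moving + t for three landmark pairs."""
    fm = frame(*moving)
    ff = frame(*fixed)
    # R maps each moving basis vector onto the matching fixed basis vector.
    R = [[sum(ff[k][i] * fm[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    rotated_origin = tuple(sum(R[i][j] * moving[0][j] for j in range(3))
                           for i in range(3))
    t = sub(fixed[0], rotated_origin)
    return R, t

def transform(R, t, p):
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

# Hypothetical landmark coordinates in mm (two molar fossae and an incisor midpoint).
moving_landmarks = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (5.0, 20.0, 0.0)]
fixed_landmarks = [(1.0, 2.0, 3.0), (1.0, 12.0, 3.0), (-19.0, 7.0, 3.0)]

R, t = rigid_from_landmarks(moving_landmarks, fixed_landmarks)
# Any other point on the moving model follows the same rigid transform.
moved_point = transform(R, t, (5.0, 5.0, -15.0))
```

A frame-based alignment reproduces the landmarks exactly only when the two triangles are congruent; with hand-picked points it gives a rough starting pose, which is why a best-fit refinement step follows in the workflow described above.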
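The two point-based measurements above can be sketched as follows. The sketch assumes that the root axis is the line from the root apex to a crown landmark and that the occlusal plane is given by its normal vector; the text specifies neither definition, and the function names are hypothetical.

```python
import math

def inter_root_distance(apex_a, apex_b):
    """3D Euclidean distance between two root apices (same units as the input)."""
    return math.dist(apex_a, apex_b)

def root_axis_angle(crown_point, apex, plane_normal):
    """Angle in degrees between the root axis and a plane given by its normal.

    Assumption: the axis runs from the apex to a crown landmark.
    """
    axis = [c - a for c, a in zip(crown_point, apex)]
    dot = sum(x * n for x, n in zip(axis, plane_normal))
    norm_axis = math.hypot(*axis)
    norm_normal = math.hypot(*plane_normal)
    # The angle to the plane is the complement of the angle to the normal;
    # abs() makes the result independent of the axis direction.
    sin_angle = min(1.0, abs(dot) / (norm_axis * norm_normal))
    return math.degrees(math.asin(sin_angle))

# Hypothetical coordinates in mm.
distance = inter_root_distance((0.0, 0.0, 0.0), (3.0, 4.0, 12.0))
angle = root_axis_angle((1.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
```

Applying the same two functions to the estimated and the actual tooth models yields the paired values that are compared in the statistical analysis.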

Statistical analysis
The means and standard deviations of the measurements were calculated for the estimated and actual tooth models and for the manual and automatic methods. The Shapiro-Wilk test supported the normality of the differences; thus, a paired t test was used to analyze the differences between the estimated and actual tooth models and between the manual and automatic methods. Statistical analysis was performed using SPSS (version 26.0; IBM SPSS, Armonk, NY). Statistical significance was set at p < 0.05. Intra-examiner repeatability was evaluated using intraclass correlation (ICC) analysis by repeating all measurements from five randomly selected individuals after 4 weeks. The ICC values were 0.857-0.939 and
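For readers who want to reproduce the paired comparison outside SPSS, the test statistic can be sketched in a few lines. The input values below are hypothetical and are not study data; the p value is not computed here and would come from a t distribution with n - 1 degrees of freedom (for example, via `scipy.stats.ttest_rel`).

```python
import math
import statistics

def paired_t(estimated, actual):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [e - a for e, a in zip(estimated, actual)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical paired measurements (not study data).
t_stat, dof = paired_t([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
```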

Results
The average surface differences between estimated and actual ITMs in the manual method were 0.02 mm and 0.03 mm for the maxillary and mandibular arches, respectively. In the automatic method, the discrepancies were 0.07 mm and 0.08 mm for the maxillary and mandibular arches, respectively. However, there was no significant difference between the two methods (Table 1).
For the measurements of arch width, inter-root distance, and root axis angle, there were no significant differences between the estimated and actual models in either the manual or the automatic method, except for some measurements (Tables 2, 3, 4, 5, 6, and 7). The premolar widths in both the maxilla and mandible showed significant differences between the estimated and actual models in the manual method. The difference in interpremolar width between the estimated and actual models was 1.1 mm and 1.7 mm in the maxilla and mandible, respectively (Table 2). The inter-root distance between the mandibular second premolar and canine showed a significant difference of 1.0 mm between the estimated and actual models in the automatic method (Table 5). The root axis angles of the maxillary first molar and mandibular incisors showed significant differences between the estimated and actual models in the automatic method. The difference was 2.4° in the maxillary first molar and 2.8° in the mandibular incisors (Table 7).
Comparing the manual and automatic methods, only three measurements showed significant differences between the two methods: the inter-root distance between the right maxillary first molar and second premolar, the inter-root distance between the left maxillary canine and lateral incisor, and the root axis angle of the mandibular incisors (Tables 8, 9, and 10).
The procedure time taken to obtain the measurements was longer in the manual method than in the automatic method. Individual tooth segmentation using the Mimics program took 15 min per tooth, so 24 teeth including the first molars required 6 h. Individual tooth segmentation using the automatic method took 1 min per tooth, so 24 teeth including the first molars required 24 min without manual labor.

Discussion
Lee et al. [13,14,16,17] attempted to combine a CBCT image and laser-scanned models for evaluating the root position at different stages of orthodontic treatment and reported that tooth root position could be predicted from a combination of pretreatment CBCT images and posttreatment laser-scanned model images. However, pretreatment CBCT images do not provide detailed information on crown morphology and occlusion. When interdigitation between the maxillary and mandibular dentition is tight or crown restorations are present, artifacts occur in the CBCT images; thus, it is difficult to integrate the two imaging modalities, i.e., CBCT and laser-scanned model imaging. Inaccurate integration of the two imaging modalities affects the final predictability of tooth root position. Therefore, in this report, pretreatment 3D tooth models were generated from pretreatment CBCT and pretreatment intraoral scans from the initial stage before orthodontic treatment. The tooth roots were obtained from the CBCT image, and the tooth crowns were obtained from the intraoral scans, thereby minimizing the possibility of overlapping error (integration error) due to crown artifacts appearing in the CBCT image at the pretreatment stage. Tooth segmentation is an important step in fabricating the individual tooth model, and accurate segmentation is essential. Various computer algorithms for automatic tooth segmentation have been proposed, and some software programs for automatic segmentation have been released in dentistry.
Table 1 Average surface difference of estimated and actual tooth model in each deep learning and manual method and its comparison between the two methods (unit: mm). Data show the average surface discrepancy (mm) between the estimated and actual model in each deep learning and manual method by means of shell/shell deviation in the program. SD, Standard deviation.
Hence, a method that isolates the tooth including the root from the alveolar bone in CBCT images without removing the alveolar bone is preferable. The software used in this study is generally used for processing medical images and creating 3D models. Unlike the segmentation of other anatomic structures such as the pelvic bone or heart, tooth segmentation from alveolar bone was difficult. Basically, the program performs segmentation by differentiating the intensity levels of multiple anatomic structures. However, the contrast levels of the tooth and alveolar bone are similar, and differentiating between the two is hard for the software because of the very narrow periodontal ligament space between tooth and alveolar bone. Thus, fully automatic segmentation was not possible for isolating the tooth from the alveolar bone. The program's automatic segmentation function, region-growing, was primarily used for rough segmentation, and the result was adjusted manually for accuracy using the slice edit tool. The region-growing tool provides the capacity to split the segmentation into separate objects. A morphology operation was performed on all slices prior to region-growing to separate the intrinsic tooth structure from the bone. The function offers two options, 8-connectivity and 26-connectivity; 26-connectivity was applied to select the pixels at the boundary of the structure, considering neighboring pixels in 3D. One study on watershed transformation segmentation reported that the proper selection of the segmentation threshold is critical for CBCT images with low contrast and a high noise level [18]. Ye et al. [9] evaluated the integration accuracy of CBCT images and dental models according to segmentation threshold settings, and they found that the accuracy of the integration of laser-scanned dental models into CBCT images is higher with a high relative Hounsfield unit threshold setting at 0.20 and 0.40 mm voxel sizes.
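To make the role of 26-connectivity concrete, the following is a minimal sketch of threshold-based 3D region growing. It is a generic textbook version, not the implementation in Mimics, and the intensity values, thresholds, and function names are hypothetical.

```python
from collections import deque
from itertools import product

# 26-connectivity: every offset in {-1, 0, 1}^3 except the voxel itself.
NEIGHBORS_26 = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

def region_grow(volume, seed, lower, upper):
    """Return the set of (z, y, x) voxels connected to the seed whose
    intensity lies within [lower, upper]."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])

    def in_range(z, y, x):
        return lower <= volume[z][y][x] <= upper

    if not in_range(*seed):
        return set()

    region = {seed}
    queue = deque([seed])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in NEIGHBORS_26:
            n = (z + dz, y + dy, x + dx)
            inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
            if inside and n not in region and in_range(*n):
                region.add(n)
                queue.append(n)
    return region

# Hypothetical 2 x 2 x 2 volume: two bright voxels that touch only at a corner.
volume = [[[1000, 0], [0, 0]],
          [[0, 0], [0, 1000]]]
region = region_grow(volume, (0, 0, 0), lower=500, upper=3000)
```

With 26-connectivity the corner-adjacent voxel joins the region, whereas a face-only neighborhood would leave it out. The same property lets a region leak from tooth into bone across a thin periodontal ligament space, which is consistent with the need for manual slice editing described above.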
To estimate the posttreatment root position, the individual tooth model at pretreatment was superimposed on the posttreatment intraoral scans. In other words, the pretreatment individual tooth model, which was fabricated by combining the pretreatment intraoral scan and CBCT data, was repositioned to match the posttreatment intraoral scan. Since the crown morphology is identical in the pretreatment and posttreatment intraoral scans, the root position changes according to the position of the posttreatment intraoral scan. This changed root position was then compared with the actual root position on the posttreatment CBCT. The actual tooth model refers to the tooth model fabricated from posttreatment CBCT data in each method. Because these actual tooth models were fabricated separately in each method, there was a slight difference between the actual tooth models from the two methods.
In Table 2, the values of the estimated models were smaller than those of the actual tooth models. All but two of the inter-arch widths were underestimated by the estimated model using the manual method. In contrast, in Table 3, using the deep learning method, three of the five maxillary inter-arch measures were overestimated. The mandibular inter-arch measures were similar to those of the manual method, in which the estimated tooth model also underestimated the actual model. A possible reason for this is morphological change of the root apex after treatment, because the actual models were fabricated using the CBCT scan taken after treatment. The underestimated values in the maxilla may be explained by morphological changes of the root apex occurring more in the maxilla than in the mandible.
In the present study, there were differences around the second premolar area between the estimated and actual tooth models. All study participants with premolar extraction had undergone first premolar extraction. The amount of tooth movement is generally large around extraction spaces. It is believed that these differences are due to the large amount of tooth movement in the second premolar area. Moreover, the root axis angle of the mandibular incisors showed significant differences between the estimated and actual models using the automatic method. The root axis angle of the mandibular incisors also showed significant differences in the comparison between the manual and automatic methods. The inter-radicular spaces around the mandibular incisors are narrow, and the inter-radicular alveolar bone there is commonly thin. This might lead to errors in the tooth segmentation process. Therefore, careful consideration of tooth segmentation is essential for this area.
In this study, the second molars were excluded, as they are rarely monitored compared to other teeth during orthodontic treatment. In addition, the root morphology of the second molars is irregular more often than that of the first molars, increasing the likelihood of errors. Furthermore, intraoral scanning inaccuracy of second molars may also lead to errors [19,20]. However, as digital technologies in orthodontic treatment, including virtual setup and indirect bonding using 3D printing, have become popular, the second molars are now included in orthodontic treatment from the beginning of treatment. Considering this, further research including measurements of the second molars is necessary.
Regarding the reproducibility of the fabrication process of tooth models, the accuracy of tooth models may be affected by the construction skill and experience of the examiners. In the present study, to minimize reproducibility errors, all processes were conducted by a single experienced researcher with over 10 years of experience in this field, who graduated from a dental school and completed a Ph.D. program in orthodontics. The tooth-modeling service of CephX, which uses convolutional neural network features, was used for the deep learning automatic method. With the advent of deep learning technology, artificial intelligence technology is showing remarkable practical effects, as it can analyze and learn like a human; recognize data in text, image, and sound format; and perform image classification, segmentation, and enhancement.
In the present study, the 3D reverse engineering software (Rapidform) was used for the integration process of the pretreatment CBCT-scanned tooth root and intraoral-scanned tooth crown. Other currently available software includes Dental Monitoring and Geomagic software (3D Systems, USA). Geomagic software mainly provides the ability to process STL or computer-aided design (CAD) file formats and is commonly used for making digital 3D models and CAD assemblies. One study regarding the accuracy of the Dental Monitoring application reported that 3D digital dental models generated by the application in photograph and video modes were accurate enough to be used for clinical applications [21].
Deep learning offers advantages in reading medical images and diagnosing diseases. Artificial intelligence technology brings forth novel diagnostic and therapeutic systems for radiology, imaging technology, ultrasonography, and pathological diagnosis, which can improve the quality and efficiency of clinical work comprehensively. Moreover, the technology is gradually changing the traditional medical model, representing a direction and trend for future human medical development.